Improving Compression of Short Messages
Abstract
Compression of short text strings, such as GSM Short Message Service (SMS) and Twitter messages, has received relatively little attention compared to the compression of longer texts. This is not surprising: on typical cellular and internet-based networks, the cost of compression probably outweighs the cost of delivering uncompressed messages. However, this is not necessarily true where the cost of data transport is high, for example where satellite back-haul is involved, or on bandwidth-starved mobile mesh networks such as those envisaged by the Serval Project for disaster relief and for rural, remote and developing contexts [1-4]. This motivated the development of a state-of-the-art text compression algorithm for mesh-based short-message traffic, culminating in the stats3 SMS compression scheme described in this paper. Stats3 uses word frequency and 3rd-order letter statistics, embodied in a pre-constructed dictionary, to effect lossless compression of short text messages. We show that the scheme typically reduces messages to less than half of their original size, and in so doing substantially outperforms all public SMS compression systems, while also matching or exceeding the marketing claims of the commercial options known to the authors. We also outline approaches for future work that have the potential to further improve the performance and practical utility of stats3.
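To make the idea of 3rd-order letter statistics concrete, the following Python sketch trains a table of "which character follows each three-character context" and uses it to estimate how many bits a message would ideally need. It is an illustration only, not the stats3 implementation: the function names `train` and `estimated_bits`, the add-one smoothing, the space padding at the start of a message and the 256-symbol alphabet are all assumptions made for this example, and the word-frequency component mentioned in the abstract is left out entirely.

```python
import math
from collections import Counter, defaultdict

ORDER = 3  # number of preceding characters used as context (assumed)


def train(corpus, order=ORDER):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    padded = " " * order + corpus  # pad so the first characters have a context
    for i in range(order, len(padded)):
        counts[padded[i - order:i]][padded[i]] += 1
    return counts


def estimated_bits(message, counts, alphabet_size=256, order=ORDER):
    """Ideal code length of `message` in bits under the trained model.

    Add-one smoothing (an assumption of this sketch) keeps the probability
    of unseen characters above zero, so any message remains encodable.
    """
    padded = " " * order + message
    bits = 0.0
    for i in range(order, len(padded)):
        context_counts = counts.get(padded[i - order:i], Counter())
        total = sum(context_counts.values())
        p = (context_counts[padded[i]] + 1) / (total + alphabet_size)
        bits -= math.log2(p)
    return bits


if __name__ == "__main__":
    model = train("see you at home " * 50)  # toy stand-in for a pre-built dictionary
    msg = "see you at home"
    print(f"{estimated_bits(msg, model):.1f} bits vs {8 * len(msg)} bits uncompressed")
```

An entropy coder, for example an arithmetic coder, driven by these probabilities could approach the estimated length. With no training data the model falls back to 8 bits per character, which shows why a pre-constructed dictionary matters for messages too short to supply their own statistics.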
Similar resources
The Effect of Short Message Service on Knowledge of Patients with Diabetes in Yazd, Iran
OBJECTIVE: Diabetes mellitus has shown a tremendous health and social burden worldwide. Better glycemic control in patients with diabetes can be achieved by improving their knowledge which consequently will prevent developing microvascular and neurological complications. Some studies demonstrate effectiveness of Short Message Service (SMS) for patient education. Regarding exponential growth in ...
Lossless Message Compression: Improving Throughput and Network Load
Upgrading the network infrastructure to increase the available bandwidth can be expensive and time consuming. In this paper we investigated whether applying compression to inter-process communication (IPC) messages, in order to increase the effective throughput of an existing network, can be beneficial or not. A literature study on lossless compression was used to select a few algorithms to be ...
Finding Second Preimages of Short Messages for Hamsi-256
In this paper we study the second preimage resistance of Hamsi-256, a second round SHA-3 candidate. We show that it is possible to find affine equations between some input bits and some output bits on the 3-round compression function. This property enables an attacker to find pseudo preimages for the Hamsi-256 compression function. The pseudo preimage algorithm can be used to find second preima...
Low Complex and Power Efficient Text Compressor for Cellular and Sensor Networks
This paper investigates a novel text compressor for short text messages. Short text messages are used in cellular and sensor networks. The terminals or nodes are battery driven and compression is needed to save energy or to use bandwidth in an efficient manner. The presented work focuses on the tradeoff between compression and energy consumption. Examples are shown where the energy to compress ...
Compression textuelle sur la base de règles issues d'un corpus de sms (Textual Compression Based on Rules Arising from a Corpus of Text Messages) [in French]
The present research seeks to reduce the size of text messages on the basis of compression techniques observed mostly in a corpus of SMS. This paper explains the methodology followed to establish compression rules. It then presents the 33 considered rules, and illustrates the four suggested levels of compression with two ...